Pii: S0893-6080(99)00043-x
Abstract
Mixture of experts (ME) is a modular neural network architecture for supervised learning. A double-loop Expectation-Maximization (EM) algorithm has been introduced to the ME architecture for adjusting the parameters, and the iteratively reweighted least squares (IRLS) algorithm is used to perform the maximization in the inner loop [Jordan, M.I., Jacobs, R.A. (1994). Hierarchical mixture of experts and the EM algorithm, Neural Computation, 6(2), 181–214]. However, it has been reported in the literature that the IRLS algorithm is unstable, and that the ME architecture trained by the EM algorithm with IRLS in the inner loop often performs poorly in multiclass classification. In this paper, the reason for this instability is explored. We find that, owing to an implicitly imposed incorrect assumption of parameter independence in multiclass classification, an incomplete Hessian matrix is used in that IRLS algorithm. Based on this finding, we apply the Newton–Raphson method, which uses the exact Hessian matrix, to the inner loop of the EM algorithm in the case of multiclass classification. To tackle the expensive computation of the Hessian matrix and its inverse, we propose an approximation to the Newton–Raphson algorithm based on a so-called generalized Bernoulli density. The Newton–Raphson algorithm and its approximation have been applied to synthetic, benchmark, and real-world multiclass classification tasks. For comparison, the IRLS algorithm and a quasi-Newton algorithm called BFGS have also been applied to the same tasks. Simulation results show that the proposed learning algorithms avoid the instability problem and enable the ME architecture to perform well in multiclass classification. In particular, our approximation algorithm leads to fast learning. In addition, the limitation of our approximation algorithm is empirically investigated in this paper. © 1999 Published by Elsevier Science Ltd. All rights reserved.
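The abstract's point about an incomplete Hessian can be illustrated with a minimal sketch. It is not the paper's algorithm, and it makes two assumptions: the multiclass outputs are modeled by a plain softmax, and the Hessian is taken with respect to the logits of a single sample. Under those assumptions the Hessian of the negative log-likelihood is the standard diag(p) − p pᵀ. Treating the classes as independent keeps only the diagonal terms and discards the cross-class coupling. The function names `softmax` and `logit_hessian` are hypothetical helpers introduced for this illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_hessian(p, full=True):
    """Hessian of the softmax negative log-likelihood w.r.t. the logits.

    full=True  -> exact Hessian, entries p_i * (delta_ij - p_j)
    full=False -> independence assumption: off-diagonal entries set to zero
    """
    k = len(p)
    hessian = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                hessian[i][j] = p[i] * (1.0 - p[i])
            elif full:
                hessian[i][j] = -p[i] * p[j]  # cross-class coupling term
    return hessian


# Illustrative logits for a three-class problem (arbitrary values).
p = softmax([2.0, 0.5, -1.0])
H_full = logit_hessian(p, full=True)
H_diag = logit_hessian(p, full=False)

# Largest coupling term that the diagonal-only version drops.
dropped = max(abs(H_full[i][j]) for i in range(3) for j in range(3) if i != j)
print("largest dropped off-diagonal magnitude:", dropped)
```

Each row of the exact Hessian sums to zero because the class probabilities sum to one, so the classes are coupled by construction. The diagonal-only version loses that property, which is one way to see why a Newton-type step built on it can be poorly scaled.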
Similar resources
Pii: S0893-6080(00)00043-5
It is demonstrated that rotational invariance and reflection symmetry of image classifiers lead to a reduction in the number of free parameters in the classifier. When used in adaptive detectors, e.g. neural networks, this may be used to decrease the number of training samples necessary to learn a given classification task, or to improve generalization of the neural network. Notably, the symmet...
Cellular, synaptic and network effects of neuromodulation
All network dynamics emerge from the complex interaction between the intrinsic membrane properties of network neurons and their synaptic connections. Nervous systems contain numerous amines and neuropeptides that modulate both the strength of synaptic connections and the intrinsic properties of network neurons. Consequently network dynamics can be tuned and configured in different w...
Zero-lag synchronous dynamics in triplets of interconnected cortical areas
Oscillatory and synchronized activities involving widespread populations of neurons in neocortex are associated with the execution of complex sensorimotor tasks and have been proposed to participate in the 'binding' of sensory attributes during perceptual synthesis. How the brain constructs these coherent firing patterns remains largely unknown. Several mechanisms of intracortical synchronizati...
Statistical estimation of the number of hidden units for feedforward neural networks
The number of required hidden units is statistically estimated for feedforward neural networks that are constructed by adding hidden units one by one. The output error decreases with the number of hidden units by an almost constant rate, if each appropriate hidden unit is selected out of a great number of candidate units. The expected value of the maximum decrease per hidden unit is estimated t...
Pii: S0893-6080(99)00042-8
This paper presents a theoretical analysis of the asymptotic memory capacity of the generalized Hopfield network. The perceptron learning scheme is proposed to store sample patterns as the stable states in a generalized Hopfield network. We have obtained that n − 1 and 2n are a lower and an upper bound of the asymptotic memory capacity of the network of n neurons, respectively, which shows th...
Pii: S0893-6080(99)00058-1
The aim of the paper is to investigate the application of control schemes based on “internal models” to the stabilization of the standing posture. The computational complexities of the control problems are analyzed, showing that muscle stiffness alone is insufficient to carry out the task. The paper also revisits the concept of the cerebellum as a Smith predictor. © 1999 Elsevier Science Ltd...